{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--BOOK_INFORMATION-->\n",
    "<a href=\"https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv\" target=\"_blank\"><img align=\"left\" src=\"data/cover.jpg\" style=\"width: 76px; height: 100px; background: white; padding: 1px; border: 1px solid black; margin-right:10px;\"></a>\n",
    "*This notebook contains an excerpt from the book [Machine Learning for OpenCV](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv) by Michael Beyeler.\n",
    "The code is released under the [MIT license](https://opensource.org/licenses/MIT),\n",
    "and is available on [GitHub](https://github.com/mbeyeler/opencv-machine-learning).*\n",
    "\n",
    "*Note that this excerpt contains only the raw code - the book is rich with additional explanations and illustrations.\n",
    "If you find this content useful, please consider supporting the work by\n",
    "[buying the book](https://www.packtpub.com/big-data-and-business-intelligence/machine-learning-opencv)!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Chaining Algorithms Together to Form a Pipeline](11.04-Chaining-Algorithms-Together-to-Form-a-Pipeline.ipynb) | [Contents](../README.md) |"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Wrapping Up\n",
    "\n",
    "Congratulations! You have just made a big step toward becoming a machine learning\n",
    "practitioner. Not only are you familiar with a wide variety of fundamental machine\n",
    "learning algorithms, you also know how to apply them to both supervised and\n",
    "unsupervised learning problems.\n",
    "\n",
    "Before we part ways, I want to give you some final words of advice, point you toward some\n",
    "additional resources, and give you some suggestions on how you can further improve your\n",
    "machine learning and data science skills."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Approaching a machine learning problem\n",
    "\n",
    "When you see a new machine learning problem in the wild, you might be tempted to jump\n",
    "ahead and throw your favorite algorithm at the problem—perhaps the one you understood\n",
    "best or had the most fun implementing. But knowing beforehand which algorithm will\n",
    "perform best on your specific problem is not often possible.\n",
    "\n",
    "Instead, you need to take a step back and look at the big picture.\n",
    "Here the book provides an easy-to-follow outline on how to approach machine learning problems in the wild (p.331ff.)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing your own OpenCV based classifier in C++\n",
    "\n",
    "Since OpenCV is one of those Python libraries that does not contain a single line of Python\n",
    "code under the hood (I'm kidding, but it's close), you will have to implement your custom\n",
    "estimator in C++."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The first step is to define a \n",
    "file `MyClass.cpp`:\n",
    "\n",
    "    #include <opencv2/opencv.hpp>\n",
    "    #include <opencv2/ml/ml.hpp>\n",
    "    #include <stdio.h>\n",
    "\n",
    "    class MyClass : public cv::ml::StatModel\n",
    "    {\n",
    "    public:\n",
    "        MyClass()\n",
    "        {\n",
    "            print(\"MyClass constructor\\n\");\n",
    "        }\n",
    "\n",
    "        ~MyClass() {}\n",
    "\n",
    "        int getVarCount() const\n",
    "        {\n",
    "            // returns the number of variables in the training samples\n",
    "            return 0;\n",
    "        }\n",
    "\n",
    "        bool empty() const\n",
    "        {\n",
    "            return true;\n",
    "        }\n",
    "\n",
    "        bool isTrained() const\n",
    "        {\n",
    "            // returns true if the model is trained\n",
    "            return false;\n",
    "        }\n",
    "\n",
    "        bool isClassifier() const\n",
    "        {\n",
    "            // returns true if the model is a classifier\n",
    "            return true;\n",
    "        }\n",
    "\n",
    "        bool train(const cv::Ptr<cv::ml::TrainData>& trainData, int flags=0) const\n",
    "        {\n",
    "            // trains the model\n",
    "            // trainData: training data that can be loaded from file using\n",
    "            //            TrainData::loadFromCSV or created with TrainData::create.\n",
    "            // flags:     optional flags, depending on the model. Some of the models\n",
    "            //            can be updated with the new training samples, not completely\n",
    "            //            overwritten (such as NormalBayesClassifier or ANN_MLP).\n",
    "            return false;\n",
    "        }\n",
    "\n",
    "        bool train(cv::InputArray samples, int layout, cv::InputArray responses)\n",
    "        {\n",
    "            // trains the model\n",
    "            // samples:   training samples\n",
    "            // layout:    see ml::SampleTypes\n",
    "            // responses: vector of responses associated with the training samples\n",
    "            return false;\n",
    "        }\n",
    "\n",
    "        float calcError(const cv::Ptr<cv::ml::TrainData>& data, bool test, cv::OutputArray resp)\n",
    "        {\n",
    "            // calculates the error on the training or test set\n",
    "            // data: the training data\n",
    "            // test: if true, the error is computed over the test subset of the data, otherwise\n",
    "            //       it's computed over the training subset of the data.\n",
    "            return 0.0f;\n",
    "        }\n",
    "\n",
    "        float predict(cv::InputArray samples, cv::OutputArray results=cv::noArray(), int flags=0) const\n",
    "        {\n",
    "            // predicts responses for the provided samples\n",
    "            // samples: the input samples, floating-point matrix\n",
    "            // results: the optional matrix of results\n",
    "            // flags:   the optional flags, model-dependent. see cv::ml::StatModel::Flags\n",
    "            return 0.0f;\n",
    "        }\n",
    "    };\n",
    "\n",
    "    int main()\n",
    "    {\n",
    "        MyClass myclass;\n",
    "        return 0;\n",
    "    }"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then create a file `CMakeLists.txt`:\n",
    "    \n",
    "    cmake_minimum_required(VERSION 2.8)\n",
    "    project(MyClass)\n",
    "    find_package(OpenCV REQUIRED)\n",
    "    add_executable(MyClass MyClass.cpp)\n",
    "    target_link_libraries(MyClass ${OpenCV_LIBS})"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then you can compile the file from the command line via `cmake` and `make`:\n",
    "    \n",
    "    $ cmake .\n",
    "    $ make"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Then run the file:\n",
    "    \n",
    "    $ ./MyClass\n",
    "    \n",
    "This should not generate any error, and print to console:\n",
    "    \n",
    "    MyClass constructor"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Writing your own Scikit-Learn based classifier in Python:\n",
    "\n",
    "Alternatively, you can write your own classifier using the scikit-learn library.\n",
    "\n",
    "You can do this by importing `BaseEstimator` and `ClassifierMixin`. The latter will\n",
    "provide a corresponding `score` method, which works for all classifiers. Optionally, you can\n",
    "overwrite the score method to provide your own.\n",
    "\n",
    "The following mixins are available:\n",
    "- `ClassifierMixin` if you are writing a classifier (will provide a basic `score` method)\n",
    "- `RegressorMixin` if you are writing a regressor (will provide a basic `score` method)\n",
    "- `ClusterMixin` if you are writing a clustering algorithm (will provide a basic `fit_predict` method)\n",
    "- `TransformerMixin` if you are writing a transformer (will provide a basic `fit_predict` method)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from sklearn.base import BaseEstimator, ClassifierMixin"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "class MyClassifier(BaseEstimator, ClassifierMixin):\n",
    "    \"\"\"An example classifier\"\"\"\n",
    "\n",
    "    def __init__(self, param1=1, param2=2):\n",
    "        \"\"\"Called when initializing the classifier\n",
    "        \n",
    "        The constructor is used to define some optional\n",
    "        parameters of the classifier. Store them as class\n",
    "        attributes for future access.\n",
    "        \n",
    "        Parameters\n",
    "        ----------\n",
    "        param1 : int, optional, default: 1\n",
    "            The first parameter\n",
    "        param2 : int, optional, default: 2\n",
    "            The second parameter\n",
    "        \"\"\"\n",
    "        self.param1 = param1\n",
    "        self.param2 = param2\n",
    "    \n",
    "    def fit(self, X, y=None):\n",
    "        \"\"\"Fits the classifier to data\n",
    "        \n",
    "        This should fit the classifier to the training data.\n",
    "        All the \"work\" should be done here.\n",
    "        \n",
    "        Parameters\n",
    "        ----------\n",
    "        X : array-like\n",
    "            The training data, where the first dimension is\n",
    "            the number of training samples, and the second\n",
    "            dimension is the number of features.\n",
    "        y : array-like, optional, default: None\n",
    "            Vector of class labels\n",
    "            \n",
    "        Returns\n",
    "        -------\n",
    "        The fit method returns the classifier object it\n",
    "        belongs to.\n",
    "        \"\"\"\n",
    "        return self\n",
    "    \n",
    "    def predict(self, X):\n",
    "        \"\"\"Predicts target labels\n",
    "        \n",
    "        This should predict the target labels of some data `X`.\n",
    "        \n",
    "        Parameters\n",
    "        ----------\n",
    "        X : array-like\n",
    "            Data samples for which to predict the target labels.\n",
    "        \n",
    "        Returns\n",
    "        -------\n",
    "        y_pred : array-like\n",
    "            Target labels for every data sample in `X`\n",
    "        \"\"\"\n",
    "        return np.zeros(X.shape[0])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The classifier can be instantiated as follows:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "myclass = MyClassifier()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can then fit the model to some arbitrary data:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "MyClassifier(param1=1, param2=2)"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X = np.random.rand(10, 3)\n",
    "myclass.fit(X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And then you can proceed to predicting the target responses:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "myclass.predict(X)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Where to go from here?\n",
    "\n",
    "The goal of this book was to introduce you to the world of machine learning and prepare\n",
    "you to become a machine learning practitioner. Now that you know everything about the\n",
    "fundamental algorithms, you might want to investigate some topics in more depth.\n",
    "\n",
    "Although it is not necessary to understand all the details of all the algorithms we\n",
    "implemented in this book, knowing some of the theory behind them might just make you a\n",
    "better data scientist.\n",
    "\n",
    "Turn to the book to find a list of suggested reading materials, books, and machine learning software!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Summary\n",
    "\n",
    "In this book, we covered a lot of theory and practice.\n",
    "\n",
    "We discussed a wide variety of fundamental machine learning algorithms, be it supervised\n",
    "or unsupervised, illustrated best practices as well as ways to avoid common pitfalls, and we\n",
    "touched upon a variety of commands and packages for data analysis, machine learning, and\n",
    "visualization.\n",
    "\n",
    "If you made it this far, you have already made a big step toward machine learning mastery.\n",
    "From here on out, I am confident you will do just fine on your own.\n",
    "All that's left to say is farewell! I hope you enjoyed the ride; I certainly did."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<!--NAVIGATION-->\n",
    "< [Chaining Algorithms Together to Form a Pipeline](11.04-Chaining-Algorithms-Together-to-Form-a-Pipeline.ipynb) | [Contents](../README.md) |"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
